Using Word-Pair Identifier to Improve Chinese Input System

Author

  • Jia-Lin Tsai
Abstract

This paper presents a word-pair (WP) identifier that can be used to resolve homonym/segmentation ambiguities and perform syllable-to-word (STW) conversion effectively to improve Chinese input systems. The experimental results show the following: (1) the WP identifier achieves tonal (syllables with four tones) and toneless (syllables without four tones) STW accuracies of 98.5% and 90.7%, respectively, among the identified word-pairs; (2) when the WP identifier is applied together with the Microsoft Input Method Editor 2003 and an optimized bigram model, the tonal and toneless STW improvements of the two input systems are 27.5%/18.9% and 22.1%/18.8%, respectively.

Similar resources

Applying a Mix Word-Pair Identifier to the Chinese Syllable-to-Word Conversion Problem

This paper describes a mix word-pair (mix-WP) identifier to resolve homonym/segmentation ambiguities as well as perform STW conversion effectively for Chinese input. The mix-WP identifier includes a specific word-pair (SWP) identifier and a common word-pair (CWP) identifier. It is designed as a supporting process for Chinese input systems. Our experiments show that by applying the mix-WP iden...

Applying Meaningful Word-Pair Identifier to the Chinese Syllable-to-Word Conversion Problem

Syllable-to-word (STW) conversion is a frequently used Chinese input method that is fundamental to syllable/speech understanding. The two major problems with STW conversion are the segmentation of syllable input and the ambiguities caused by homonyms. This paper describes a meaningful word-pair (MWP) identifier that can be used to resolve homonym/segmentation ambiguities and perform STW convers...

Applying an NVEF Word-Pair Identifier to the Chinese Syllable-to-Word Conversion Problem

Syllable-to-word (STW) conversion is important in Chinese phonetic input methods and speech recognition. There are two major problems in the STW conversion: (1) resolving the ambiguity caused by homonyms; (2) determining the word segmentation. This paper describes a noun-verb event-frame (NVEF) word identifier that can be used to solve these problems effectively. Our approach includes (a) an NV...

Word Sense Disambiguation and Sense-Based NV Event Frame

Word sense is ambiguous in natural language processing (NLP). This phenomenon is particularly acute in cases involving noun-verb (NV) word-pairs. This paper describes a sense-based noun-verb event frame (NVEF) identifier that can be used to disambiguate word sense in Chinese sentences effectively. A knowledge representation system (the NVEF-KR tree) for the NVEF sense-pair identifier is also pro...

Using Word Support Model to Improve Chinese Input System

This paper presents a word support model (WSM). The WSM can effectively perform homophone selection and syllable-word segmentation to improve Chinese input systems. The experimental results show that: (1) the WSM is able to achieve tonal (syllables input with four tones) and toneless (syllables input without four tones) syllable-to-word (STW) accuracies of 99% and 92%, respectively, among the c...

Publication date: 2005